-
This paper presents a comparison of two instructional strategies meant to help learners better comprehend code and learn programming concepts: reading code examples annotated with expert explanations (worked-out examples) versus scaffolded self-explanation of code examples using an automated system (an Intelligent Tutoring System). A randomized controlled trial was conducted with 90 university students who were assigned to either the control group (reading worked-out examples, a passive strategy) or the experimental group, in which participants were asked to self-explain and received help, if needed, in the form of questions from the tutoring system (scaffolded self-explanation, an interactive strategy). We found that students with low prior knowledge in the experimental condition had significantly higher learning gains than students with high prior knowledge. In the control condition, however, this difference in learning outcomes based on prior knowledge was not observed. We also analyzed the effect of self-efficacy on learning gains and on the nature of self-explanations. Low self-efficacy students learned almost twice as much in the interactive condition as in the passive condition, although the difference was not significant, likely because of the small sample size. We also found that high self-efficacy students tend to provide more relational explanations, whereas low self-efficacy students provide more multi-structural, or line-by-line, explanations.
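The subgroup comparison described above follows a standard pattern. Below is a minimal sketch, not the study's actual analysis code; the file name and column names (condition, pretest, posttest) are assumptions, and the median split on pretest is one common way to operationalize low vs. high prior knowledge.

```python
# Sketch: compare learning gains of low- vs. high-prior-knowledge students
# within each condition. Data layout and column names are hypothetical.
import pandas as pd
from scipy import stats

df = pd.read_csv("results.csv")              # columns: condition, pretest, posttest (assumed)
df["gain"] = df["posttest"] - df["pretest"]

# Median split on pretest as a stand-in for low/high prior knowledge.
median_pre = df["pretest"].median()
df["prior"] = (df["pretest"] > median_pre).map({True: "high", False: "low"})

for cond, group in df.groupby("condition"):
    low = group.loc[group["prior"] == "low", "gain"]
    high = group.loc[group["prior"] == "high", "gain"]
    t, p = stats.ttest_ind(low, high, equal_var=False)   # Welch's t-test
    print(f"{cond}: low-prior M={low.mean():.2f}, high-prior M={high.mean():.2f}, p={p:.3f}")
```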
-
This paper systematically investigates the generation of code explanations by Large Language Models (LLMs) for code examples commonly encountered in introductory programming courses. Our findings reveal significant variations in the nature of code explanations produced by LLMs, influenced by factors such as the wording of the prompt, the specific code examples under consideration, the programming language involved, the temperature parameter, and the version of the LLM. However, a consistent pattern emerges for Java and Python, where explanations exhibit a Flesch-Kincaid readability level of approximately grade 7-8 and a consistent lexical density, i.e., the proportion of meaningful words relative to the total explanation size. Additionally, the generated explanations consistently achieve high scores for correctness but lower scores on three other metrics: completeness, conciseness, and specificity.
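For reference, the two text metrics named above can be computed as follows. This is a minimal sketch, assuming a crude vowel-group syllable counter and a small stopword list as a proxy for "meaningful words"; the paper's actual tooling may differ.

```python
# Sketch of the Flesch-Kincaid grade level and lexical density computations.
# The syllable counter and stopword list are rough approximations.
import re

STOPWORDS = {"a", "an", "the", "and", "or", "but", "of", "to", "in", "is",
             "are", "it", "this", "that", "for", "on", "with", "as", "by"}

def count_syllables(word: str) -> int:
    # Count vowel groups as a crude syllable estimate.
    return max(1, len(re.findall(r"[aeiouy]+", word.lower())))

def flesch_kincaid_grade(text: str) -> float:
    # Standard FK grade formula: 0.39*(words/sentences) + 11.8*(syllables/words) - 15.59
    sentences = max(1, len(re.findall(r"[.!?]+", text)))
    words = re.findall(r"[A-Za-z']+", text)
    syllables = sum(count_syllables(w) for w in words)
    return 0.39 * len(words) / sentences + 11.8 * syllables / len(words) - 15.59

def lexical_density(text: str) -> float:
    # Share of non-stopword tokens, approximating "meaningful words".
    words = [w.lower() for w in re.findall(r"[A-Za-z']+", text)]
    return len([w for w in words if w not in STOPWORDS]) / len(words)

explanation = "This loop iterates over the array and accumulates the sum of its elements."
print(flesch_kincaid_grade(explanation), lexical_density(explanation))
```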
-
Worked examples, which present explained code for solving typical programming problems, are among the most popular types of learning content in programming classes. Most approaches and tools for presenting these examples to students are based on line-by-line explanations of the example code. However, instructors rarely have time to provide explanations for the many examples typically used in a programming class. In this paper, we assess the feasibility of using LLMs to generate code explanations for passive and active example exploration systems. To achieve this goal, we compare the code explanations generated by ChatGPT with explanations generated by both experts and students.
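The abstract does not specify the prompts used, so the following is a purely hypothetical sketch of eliciting a line-by-line explanation through the OpenAI Python client; the model name, temperature, prompt wording, and code example are all assumptions.

```python
# Hypothetical sketch of requesting a line-by-line code explanation from an LLM.
# The study's actual prompts and model settings are not specified in the abstract.
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

code_example = """\
int sum = 0;
for (int i = 0; i < arr.length; i++) {
    sum += arr[i];
}"""

response = client.chat.completions.create(
    model="gpt-4o-mini",             # assumed model; the paper used ChatGPT
    temperature=0.0,                 # deterministic output for reproducible comparisons
    messages=[{"role": "user",
               "content": "Explain the following Java code line by line "
                          "for a novice programmer:\n" + code_example}],
)
print(response.choices[0].message.content)
```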
-
The ability to automatically assess learners' activities is the key to user modeling and personalization in adaptive educational systems. The work presented in this paper opens an opportunity to expand the scope of automated assessment from traditional programming problems to code comprehension tasks in which students are asked to explain the critical steps of a program. The ability to automatically assess these self-explanations offers a unique opportunity to understand the current state of student knowledge, recognize possible misconceptions, and provide feedback. Annotated datasets are needed to train Artificial Intelligence/Machine Learning approaches for the automated assessment of student explanations. To answer this need, we present a novel corpus called SelfCode, which consists of 1,770 sentence pairs of student and expert self-explanations of Java code examples, along with semantic similarity judgments provided by experts. We also present a baseline automated assessment model that relies on textual features. The corpus is available in the GitHub repository (https://github.com/jeevanchaps/SelfCode).
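A baseline over such sentence pairs can be as simple as a lexical similarity score compared against the expert judgments. The sketch below uses TF-IDF cosine similarity on an invented pair; the paper's actual textual features are likely richer than this.

```python
# Minimal baseline sketch: score a student explanation against an expert one
# with TF-IDF cosine similarity. Example sentences are invented.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

student = "the loop goes through each number and adds it to the total"
expert = "the for loop iterates over the array, accumulating each element into sum"

vectors = TfidfVectorizer().fit_transform([student, expert])
score = cosine_similarity(vectors[0], vectors[1])[0, 0]
print(f"similarity: {score:.2f}")   # could be compared against expert judgments
```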
-
Self-efficacy, or the belief in one's ability to accomplish a task or achieve a goal, can significantly influence the effectiveness of various instructional methods to induce learning gains. The importance of self-efficacy is particularly pronounced in complex subjects like Computer Science, where students with high self-efficacy are more likely to feel confident in their ability to learn and succeed. Conversely, those with low self-efficacy may become discouraged and consider abandoning the field. The work presented here examines the relationship between self-efficacy and student learning of computer programming concepts. For this purpose, we conducted a randomized control trial experiment with university-level students who were randomly assigned to two groups: a control group whose participants read Java programs accompanied by explanatory texts (a passive strategy) and an experimental group whose participants self-explained while interacting through dialogue with an intelligent tutoring system (an interactive strategy). We report here the findings of this experiment with a focus on self-efficacy, its relation to student learning gains (measured via pre-/post-tests), and other important factors such as prior knowledge and experimental condition/instructional strategy, as well as interaction effects.
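Interaction effects of the kind mentioned above are commonly tested with a factorial regression model. A minimal sketch, assuming hypothetical column names and the statsmodels formula interface, might look like:

```python
# Sketch of an interaction analysis: learning gain as a function of condition,
# self-efficacy, and their interaction. Data file and columns are hypothetical.
import pandas as pd
import statsmodels.formula.api as smf

df = pd.read_csv("experiment.csv")   # columns: pretest, posttest, condition, self_efficacy (assumed)
df["gain"] = df["posttest"] - df["pretest"]

# C(condition) treats condition as categorical; '*' adds both main effects
# and the condition x self-efficacy interaction term.
model = smf.ols("gain ~ C(condition) * self_efficacy + pretest", data=df).fit()
print(model.summary())
```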
-
Domain modeling is a central component in education technologies, as it represents the target domain students are supposed to train on and eventually master. Automatically generating domain models can lead to substantial cost and scalability benefits. Automatically extracting key concepts or knowledge components from, for instance, textbooks can enable the development of automatic or semi-automatic processes for creating domain models. We explore in this work the use of transformer-based pre-trained models for the task of keyphrase extraction. Specifically, we investigate and evaluate four different variants of BERT, a pre-trained transformer-based architecture, that vary in terms of training data, training objective, or training strategy to extract knowledge components from textbooks for the domain of intro-to-programming. We report results obtained using the following BERT-based models: BERT, CodeBERT, SciBERT, and RoBERTa.
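One common way to use pre-trained encoders for keyphrase extraction is the KeyBERT-style approach: embed the passage and candidate phrases, then rank candidates by similarity to the passage. The sketch below illustrates that idea; the embedding model, the candidate-generation scheme, and the example text are assumptions, not the paper's exact pipeline.

```python
# KeyBERT-style sketch: rank candidate phrases by embedding similarity to the
# source text. The paper's exact pipeline and BERT variants may differ.
from sentence_transformers import SentenceTransformer
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.metrics.pairwise import cosine_similarity

text = ("A for loop repeats a block of statements a fixed number of times, "
        "using a loop variable that is initialized, tested, and updated.")

# Candidate keyphrases: unigrams and bigrams drawn from the text itself.
candidates = CountVectorizer(ngram_range=(1, 2), stop_words="english") \
    .fit([text]).get_feature_names_out()

model = SentenceTransformer("all-MiniLM-L6-v2")   # stand-in for the BERT variants
doc_vec = model.encode([text])
cand_vecs = model.encode(list(candidates))

scores = cosine_similarity(doc_vec, cand_vecs)[0]
top = sorted(zip(candidates, scores), key=lambda p: -p[1])[:5]
print(top)   # highest-ranked phrases approximate knowledge components
```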
-
We present in this paper an automated method to assess the quality of Jupyter notebooks. The quality of notebooks is assessed in terms of reproducibility and executability. Specifically, we automatically extract a number of expert-defined features for each notebook, perform a feature selection step, and then train supervised binary classifiers to predict whether a notebook is reproducible and executable, respectively. We also experimented with semantic code embeddings to capture the notebooks' semantics. We evaluated these methods on a dataset of 306,539 notebooks and achieved an F1 score of 0.87 for reproducibility and 0.96 for executability using expert-defined features, and an F1 score of 0.81 for reproducibility and 0.78 for executability using code embeddings. Our results suggest that semantic code embeddings can determine, with good performance, the reproducibility and executability of Jupyter notebooks, and since they can be derived automatically, they have the advantage of not requiring expert involvement to define features.
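The expert-feature pipeline described above has a familiar shape: feature table in, feature selection, binary classifier, F1 out. A minimal sketch, assuming a hypothetical feature file and label column (the paper defines its own expert features), might look like:

```python
# Sketch of the expert-feature pipeline: feature selection followed by a
# binary classifier. Feature names and the data file are hypothetical.
import pandas as pd
from sklearn.ensemble import RandomForestClassifier
from sklearn.feature_selection import SelectKBest, f_classif
from sklearn.metrics import f1_score
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline

df = pd.read_csv("notebook_features.csv")    # one row per notebook (assumed)
X = df.drop(columns=["reproducible"])        # e.g. cell counts, imports, magics
y = df["reproducible"]                       # 1 = reproducible, 0 = not

X_train, X_test, y_train, y_test = train_test_split(X, y, stratify=y, random_state=0)

clf = make_pipeline(SelectKBest(f_classif, k=20),     # feature selection step
                    RandomForestClassifier(random_state=0))
clf.fit(X_train, y_train)
print("F1:", f1_score(y_test, clf.predict(X_test)))
```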
-
We present in this paper the results of a randomized control trial experiment that compared the effectiveness of two instructional strategies that scaffold learners' code comprehension processes: eliciting Free Self-Explanation and a Socratic Method. Code comprehension, i.e., understanding source code, is a critical skill for both learners and professionals. Improving learners' code comprehension skills should result in improved learning, which in turn should help with retention in intro-to-programming courses, which are notorious for very high attrition rates due to the complexity of programming topics. To this end, the reported experiment explores the effectiveness of various strategies to elicit self-explanation as a way to improve comprehension and learning during complex code comprehension and learning activities in intro-to-programming courses. The experiment showed pre-/post-test learning gains of 30% (M = 0.30, SD = 0.47) for the Free Self-Explanation condition and learning gains of 59% (M = 0.59, SD = 0.39) for the Socratic method. Furthermore, we investigated the behavior of the two strategies as a function of students' prior knowledge, which was measured using learners' pretest scores. For the Free Self-Explanation condition, there was no significant difference in mean learning gains for low vs. high knowledge students. The magnitude of the difference in performance (mean difference = 0.02, 95% CI: -0.34 to 0.39) was very small (eta squared = 0.006). Likewise, the Socratic method showed no significant difference in mean learning gains between low vs. high performing students. The magnitude of the performance difference (mean difference = -0.24, 95% CI: -0.534 to 0.03) was large (eta squared = 0.10). These findings suggest that eliciting self-explanations can be an effective strategy and that guided self-explanation, as in the Socratic method condition, is more effective at inducing learning gains.
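The gain and effect-size quantities reported above can be reproduced on any pre-/post-test data. The sketch below assumes Hake's normalized gain and derives eta squared from the t statistic; whether the study used exactly these definitions is an assumption, and the data are toy values.

```python
# Sketch of gain and effect-size computations. The choice of normalized gain
# and the eta-squared-from-t formula are assumptions; the data are toy values.
import numpy as np
from scipy import stats

def normalized_gain(pre, post):
    # Hake's normalized gain: fraction of the possible improvement achieved.
    pre, post = np.asarray(pre, dtype=float), np.asarray(post, dtype=float)
    return (post - pre) / (1.0 - pre)

free_se = normalized_gain([0.4, 0.5, 0.3], [0.6, 0.7, 0.5])    # toy data
socratic = normalized_gain([0.4, 0.3, 0.5], [0.8, 0.7, 0.9])   # toy data

t, p = stats.ttest_ind(free_se, socratic, equal_var=False)
dof = len(free_se) + len(socratic) - 2      # simple df approximation
eta_sq = t**2 / (t**2 + dof)                # eta squared derived from t
print(f"M_free={free_se.mean():.2f}, M_socratic={socratic.mean():.2f}, "
      f"p={p:.3f}, eta^2={eta_sq:.3f}")
```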